fc - A Public Domain FORTH Cross Compiler for the RTX2000
Microcontroller
Version 1.14
fc is a FORTH compiler that generates machine code for the
Harris RTX2000 microcontroller chip. fc accepts text files generated
by a text editor and produces optimized object code files for the
RTX2000. fc has options for generating annotated object code listings
and cross reference files which can be incorporated into other
programs to link variable and word definitions. fc can also be used
to produce code for the Harris RTX2010; RTX2010 specific
instructions can be added using the #macro and ucode features.
fc currently runs on an AMIGA or IBM PC compatible computer
with either a hard disk or ram disk. fc has been tested using
AmigaDos 1.3 and MS-DOS 5.0, but may work with other versions of
either operating system.
fc was made possible by the Public Domain version of Berkeley
yacc available on Fred Fish Disk #419. Many thanks to Bob Corbett and
those who helped make this public domain version of yacc.
I would also like to thank John Goldsten and Augie Mattheiss
for their help in debugging and testing the compiler. John, Augie,
and the author have developed several programs using the MS-DOS and
AMIGA versions of the compiler that have been successfully executed on
RTX2000 target systems. We normally use the -o output option or the -e
output option for EPROMs. The EPROM output option has been
successfully used with a DATAIO UNISITE programmer.
fc is a Public Domain program so no fee should be charged for
distribution except for a possible media charge. fc was developed
using Aztec C for AMIGA version 3.6A with the default 16 bit integers.
fc has been ported to MS-DOS and the MS-DOS executable is included in
the distribution. That version is identical to the AMIGA version
except for some details pertaining to memory allocation and the naming
of some of the source file names. An experimental Macintosh version
may also be included in the distribution and it will require MPW.
fc is currently being used for software development, so future
revisions are likely. We already have several months' experience
producing and executing code generated by fc. However, fc is offered
"as is" and will not necessarily be supported. Anyone wishing to
develop large programs for the RTX2000 or RTX2010 may wish to look
into other development systems or cross compilers from one of the
several vendors marketing them.
A test file is included with the distribution to demonstrate
code production of the compiler. This output has been checked against
the instruction set listed in the Harris RTX2000 Programmer's
Reference Manual. However, users should check the code production in
a disassembly output should they run into any really puzzling problems
when debugging their programs. We have developed several programs up
to 4K bytes long and have not run into any code production problems
for the last couple of months.
Background
The RTX2000 microcontroller is a 16 bit microprocessor
architecturally designed for executing the FORTH language. The
RTX2000 instruction set corresponds to FORTH primitive words (like
SWAP, DROP, @, !, etc) and combinations of primitive words which, in
some cases, allow several FORTH instructions to be encoded into a
single RTX2000 instruction. All RTX2000 instructions execute in one
or two cycles and are sixteen or thirty-two bits long. It is common
to run the RTX2000 at rates of 8MHz to 10MHz making it fast for real
time control applications.
The RTX2000 chip includes two onboard stacks (256 deep): one is
used for parameters and the other is used mainly for subroutine
returns. This allows stack operations like SWAP, DROP, etc, to be
performed without an external memory access. Likewise, subroutine
returns are done using the onboard return stack, making subroutine
overhead quite low. In fact, most RTX2000 instructions can be coded
to perform a return as part of the instruction, so returns often don't
even require any extra code.
The RTX2000 has several onboard peripherals, including three
timer/counters and an interrupt controller. For more information on
the RTX2000, consult the data sheet from Harris Semiconductor.
Potential users are strongly advised to investigate Harris' plan for
future support of the RTX2000 family before committing to the chip.
fc should also be compatible with the Harris RTX2010RH which is
sold as an ASIC. The RTX2010RH features higher radiation tolerance
than the RTX2000, a barrel shifter, and a multiply accumulate circuit.
Support for RTX2010 specific instructions can be added by using the
macro capability of fc.
Running fc
fc is invoked from the CLI (AMIGA) or MS-DOS by typing:
fc {-lreoxsd} {-iaaa} {-tbbb} filename
fc recognizes nine command line options - l, x, e, o, r, s, d, i and
t. The l, x, s, d, i, and t options can be used in any combination
along with one of the other three options. All options must precede
the source file name. The source file name is used as the base name
for generating output files. If the source file has an extension, the
extension is removed when forming the output file names.
The l option writes a disassembled listing of the program into
a file named filename.lst. The e option produces two EPROM output
files named filename.low and filename.hi instead of the default object
code output. The r option also generates EPROM files, but the data is
in a nonstandard bit shuffled format. The o option produces an
ascii/hex output file named filename.ols. The x option generates a
definition file named filename.x which can be used by other programs
to access words and variables in the compiled program. The s option
prints a symbol table into filename.sym. The d option turns on the
conditional compilation. The i option specifies a directory for
include files. Finally, the t option specifies a directory for
temporary files.
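
As an illustration of the naming rules above, here is a small Python
sketch (not part of fc) that maps option letters to the output files
one would expect. The function name output_files is hypothetical, and
treating everything from the last '.' onward as the extension is an
assumption. The r option is left out because the text does not name
its output files.

```python
import os

# Output extensions per option letter, taken from the description above.
OUTPUT_EXTENSIONS = {
    "l": [".lst"],          # disassembled listing
    "e": [".low", ".hi"],   # EPROM low and high byte files
    "o": [".ols"],          # ascii/hex object output
    "x": [".x"],            # definition file
    "s": [".sym"],          # symbol table
}


def output_files(source_name, options):
    """Return the output file names expected for the given option letters.

    Assumption: the extension is everything from the last '.' onward.
    Options that write no named file (d, i, t) contribute nothing.
    """
    base, _ext = os.path.splitext(source_name)
    names = []
    for letter in options:
        for ext in OUTPUT_EXTENSIONS.get(letter, []):
            names.append(base + ext)
    return names
```

For example, output_files("test.f", "lx") gives test.lst and test.x.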
The l option generates an object code listing along with the
corresponding FORTH source code. This allows the programmer to check
how well the code was optimized. Often an fc listing file will include
a single line with thirty-two bits of code. This is done for long
literal instructions (which are 32 bits long) and when optimizations
bail out due to a missing key instruction (often an alu operation,
fetch, or store). The listing file also includes any error messages.
The e option generates object code for hi and low byte EPROMs.
The output format is ASCII/HEX compatible with DATAIO format 51 (or
56) and is produced only if no errors occurred during compilation.
This format consists of a start code, one or more address declarations
and data blocks, followed by a stop code and checksum. After
compilation, fc will query the user for the EPROM base address. This
address is subtracted from the code address to produce an EPROM
relative address for output files. (If the EPROMs are at 0x8000 and
code starts at 0x8a00, the EPROM files would have address 0xa00.)
The r option generates the same type of files except the data for each
byte is bit reversed.
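
The address arithmetic and byte handling described above can be
sketched in Python as follows. This is only a model, not fc's code.
The helper names are hypothetical, and reading "bit reversed" as a
full reversal of the eight bits in each byte is an assumption.

```python
def eprom_relative(code_address, eprom_base):
    """Address written to the EPROM files for a given code address."""
    return code_address - eprom_base


def split_word(word):
    """Split a 16 bit word into (high byte, low byte)."""
    return (word >> 8) & 0xFF, word & 0xFF


def reverse_bits(byte):
    """Reverse the bit order of one byte (assumed meaning of the r option)."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (byte & 1)   # move the lowest bit across
        byte >>= 1
    return result
```

With the numbers from the text, eprom_relative(0x8a00, 0x8000) is
0xa00.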
The o option generates object code in an ascii/hex format that
is compatible with the load block format described later. Each 16 bit
entity in the output file is represented by a four digit hexadecimal
number. The format consists of a configuration number (equal to the
compiler version), start address, data word count, control checksum,
object code, and block checksum. As with the other output formats,
code is only produced if there are no compilation errors.
The x option produces a file with information about the symbols
used in the program. This information includes the address of all
defined words and variables that are preceded with the "xlink" label.
The "xcode" and "xheap" labels will cause the next available address
for code and/or variable allocation to be written to the file. The
information is provided in a form that is compatible with fc. That
is, another fc program can use a #include directive and include this
file and reference all the words and variables qualified with "xlink"
in the original program. If the xcode and xheap qualifiers are used
in the original program, any new variables and code produced will fit
contiguously with those produced by the first program. This allows
programs to be compiled a piece at a time. For example, some code
could be developed for EPROM and some for RAM. The code developed for
RAM could use this feature to call routines in EPROM.
The s option produces a file containing the symbol table. Each
symbol is listed along with its hexadecimal value. A '*' is used to
indicate words and variables that are qualified with the "xlink"
label.
The d option compiles all lines in the input that start with a
'#' followed by a space. Normally these lines are ignored if the d
option is not specified. This allows debug code to be inserted into a
program conditionally depending on whether the d option is specified at
compile time.
The i option specifies a directory to be searched for include
files. The directory name should immediately follow the i without any
spaces.
The t option specifies a directory for temporary files. The
directory name should immediately follow the t without any spaces.
Environment
fc supports two environment variables, FC_TEMP and FC_INCLUDE.
These environment variables define the path names for temporary files
and include files respectively. Unfortunately with the AMIGA version,
the variables must be set using the Aztec "set" command supplied with
the compiler instead of the environment commands provided with
AmigaDos 1.3.
Setting the environment variables is not required, but can make
development more convenient and faster. The FC_TEMP variable should
be set to a directory on either the hard disk or ram disk. If this
variable is not set, temporary files will be written to the current
working directory. The FC_INCLUDE directory can be set to a directory
containing a collection of include files. This directory will be
searched for an include file if it can not be found in the current
working directory.
If an include search directory is specified with the i option,
it will override the directory specified by the FC_INCLUDE environment
variable. Also if a temporary file directory is specified with the t
option, it will override the directory specified by the FC_TEMP
environment variable.
fc Preprocessor
When fc is invoked, it processes the input file through a
preprocessor that expands all include files and macros and produces a
temporary file used for input to the compiler. The preprocessor also
supports conditional compilation, removes comments, and inserts
filename/line number stamps used as reference for producing error
messages. The preprocessor reserves the '#' character for specifying
commands and conditional compilation.
The preprocessor has two commands -- one for including files
and another for macro definitions. Either command must start in
column one and the keyword (include or macro) must be in lower case
letters. The entire command must fit on a single line.
The include command inserts the contents of another file into the
program. The filename follows "#include " and must not have any
spaces and may be qualified to access files outside the current
directory. The current directory is searched for the file. If it is
not found, the directory specified by the i option or the FC_INCLUDE
environment variable is searched.
The macro command takes the first text following #macro as the
macro name and the remainder of the line as the replacement text.
From then on, whenever the macro name is encountered, the replacement
text is substituted. Macro substitutions are case sensitive. Text
enclosed in double quotes and macro replacement text are not
searched for macros.
Examples of the include and macro commands:
#include filename
#macro name replacement text
#include monitor.x
#macro nip swap drop
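
The substitution rules above can be modelled with a short Python
sketch. It is not fc's implementation. The function name
expand_macros is hypothetical, and it assumes that macro names are
matched as whole blank-separated tokens and that strings contain no
escaped double quotes.

```python
import re

# A token is either a double-quoted string or a run of non-blank characters.
TOKEN = re.compile(r'"[^"]*"|\S+')


def expand_macros(lines):
    """Apply #macro definitions to the source lines that follow them."""
    macros = {}
    output = []
    for line in lines:
        if line.startswith("#macro "):
            parts = line.split(None, 2)      # ['#macro', name, replacement]
            macros[parts[1]] = parts[2] if len(parts) > 2 else ""
            continue                         # the command itself emits nothing

        def substitute(match):
            token = match.group(0)
            if token.startswith('"'):
                return token                 # quoted text is never searched
            return macros.get(token, token)  # replacement text is not rescanned

        output.append(TOKEN.sub(substitute, line))
    return output
```

Given the nip macro above, the line ": foo nip ;" becomes
": foo swap drop ;", while "nip" inside double quotes and the
differently cased NIP are left alone.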
Comments start with a '(' character and end with the
corresponding ')'. Any number of matched pairs of '(' and ')' may
appear within a comment. Comments may not begin or end on a line with
a #macro command.
Macros and include files are used to extend the capabilities of
the fc compiler. For example, fc does not directly recognize the
RTX2000 register instructions such as pc@, mr!, r>, yet these
instructions are easily added through the use of macros. Code libraries
and files filled with standard macros can be developed and accessed
easily with the include command. The i option or the FC_INCLUDE
environment variable allows a single directory to be set up as a
repository of macro definitions and code libraries.
In addition to supporting include and macro, the '#' character
is also used to support conditional compilation. A line in a source
file can be made conditional by inserting a '#' as the first character
of the line. At least one space or tab should follow the '#' before
any of the source code. If the program is compiled with the d option,
the conditional lines are included in the program. If it is compiled
without the d option, the conditional lines are not included in the
program.
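
A minimal Python model of this conditional compilation rule is shown
below. The function name apply_conditionals and the word names in the
usage note are hypothetical, and removing only the '#' marker from a
kept line is an assumption about how the preprocessor passes the line
on.

```python
def apply_conditionals(lines, debug):
    """Keep or drop lines that start with '#' followed by a space or tab.

    debug mirrors the d option. Preprocessor commands such as
    '#include' and '#macro' are passed through untouched.
    """
    result = []
    for line in lines:
        if line.startswith("# ") or line.startswith("#\t"):
            if debug:
                result.append(line[1:])   # drop the marker, keep the code
            continue                      # without d the line disappears
        result.append(line)
    return result
```

With debug set, a line such as "# show_status" is compiled as
" show_status"; without it, the line is dropped.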
fc Memory Allocation
The RTX2000 uses absolute addressing for branches and variable
reference and the compiler fixes the code and variable locations at
compile time. Code is compiled starting at a user defined address
(defaults to zero) and grows into higher memory. Variables are stored
in an area referred to as the heap which starts in high memory and
grows downward into low memory. The heap starting point can also be
defined (defaults to 10000 hex). When a variable is defined, the
current heap value is decremented (once for characters, twice for
words) to provide the variable address. (Variable allocation also
assures that all 16 bit variables are assigned even addresses.)
Typically the program starting address is assigned to the
lowest available memory address and the initial heap value is set to
one plus the highest memory address. fc does not support paged memory
so all variables and code are limited to a single 64K byte page.
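
The downward heap allocation described in this section can be
sketched in Python. The class below is a model only. Its names are
hypothetical, and rounding down to an even address after the
decrement is an assumption about how fc keeps 16 bit variables
aligned.

```python
class Heap:
    """Downward-growing variable allocator modelled on the text above."""

    def __init__(self, start=0x10000):
        self.top = start              # next allocation goes below this

    def cvariable(self):
        """Allocate one byte and return its address."""
        self.top -= 1
        return self.top

    def variable(self):
        """Allocate one 16 bit word and return its even address."""
        self.top -= 2
        self.top &= ~1                # assumed: round down to an even address
        return self.top
```

Starting from the default heap of 10000 hex, the first word variable
lands at FFFE hex, a following character variable at FFFD hex, and
the next word variable at FFFA hex.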
Reserved Words and Characters
The words listed below are recognized by the compiler as being
special and cannot be used as word or variable names. These words
also correspond to the operations supported by the compiler.
; : ) code heap constant variable cvariable array carray
xvariable xword nop ucode again begin drop dup else if ?dup_if
next over repeat swap then until while nop exit not of( 0< 2*
2*c cU2/ c2/ U2/ 2/ N2* N2*c D2* D2*c cUD2/ cD2/ UD2* D2/ + -
and or xor nor nand +c -c xnor g@ g! u@ u! c@ c! @ ! @+
@- c@+ c@- !+ c!+ !- c!- ['] , { } byte word xcode xheap xlink
The '#' character is reserved for use by preprocessor commands,
conditional compilation, and as a character inside a string
definition. The '"' character is reserved for strings and should not
be used in word and variable definitions.
Numbers and Strings
fc recognizes base 10 and base 16 (hex) numbers. Numbers
preceded by 0x are assumed to be hex and all other numbers are
assumed to be base ten.
fc also recognizes strings. Strings may be included in word
definitions and must be enclosed by double quotes (" "). Within the
source file, strings may not extend over more than one line of the
source file text. Special characters may be embedded into strings by
using the following codes:
\n Carriage Return, Linefeed
\t Tab character
\b Backspace character
\r Carriage Return
\f Form Feed
\\ Backslash
\" Double quote
Word, Variable and Constant names
fc supports user defined names up to 32 characters long. In
general, all names should start with a non numeric character and be
kept to 20 characters or less. As a special case, names can start
with a number provided that the second character is not a number or an
'x' or 'X'.
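
The naming rules can be expressed as a small Python check. This is a
sketch, not the compiler's own test. The function name is_valid_name
is hypothetical, rejecting a lone digit is an assumption, and the
reserved word list is not consulted.

```python
def is_valid_name(name):
    """Check a user defined name against the length and leading-digit rules."""
    if not 1 <= len(name) <= 32:
        return False
    if '"' in name or any(ch.isspace() for ch in name):
        return False                  # '"' is reserved for strings
    if name[0].isdigit():
        # A leading digit is allowed only when the next character
        # cannot make the name look like a decimal or 0x number.
        if len(name) == 1 or name[1].isdigit() or name[1] in "xX":
            return False
    return True
```

Under these rules 2swap is accepted, while 10ms and 0xfeed are
rejected.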
fc Syntax
An fc program consists of memory definitions, variable
definitions, constant definitions, data definitions, x list control,
and word definitions.
Memory definitions tell the compiler where to place code and
variables. They have the form:
number CODE ( sets next code output to address = number )
number HEAP ( sets heap = number )
The program may have any number of code and heap definitions,
but typically there will be only one heap definition and code
definitions will be used at the start of the program and when words
have to be compiled to specific addresses (like interrupt routines).
fc is very stupid and will allow you to overwrite code and variables
without the slightest warning, so be careful. Note that for each code
definition, fc will start a new output section.
Variable definitions tell the compiler to allocate memory for
variables and arrays. They have the form:
VARIABLE name ( declares name as a variable and allocates 1 word )
CVARIABLE name ( same as above, but allocates 1 byte )
number XVARIABLE name (assumes a variable name is at address = number)
number ARRAY name ( declares and allocates number words for name )
number CARRAY name ( declares and allocates number bytes for name )
The only difference between these definitions is the amount of
memory allocated. Once a variable is declared and allocated, fc knows
no difference between a cvariable, variable, array, carray, or
xvariable. The programmer is responsible for ensuring that each
variable has the proper allocation for any operations performed in the
program.
Constant definitions tell the compiler about different kinds of
constants that can be used. They have the form:
number CONSTANT name ( declare name as a constant = number )
number XWORD name ( declare name as a word at address = number )
number UCODE name ( declare name as a machine instruction = number )
Constant definitions allow the user to use symbolic names in
place of actual numbers. Whenever a constant is encountered in a
FORTH word definition, fc will generate code to push its declared
value onto the parameter stack. Whenever an xword is encountered, fc
will generate code for a subroutine call to the declared address.
When a ucode is encountered, the declared value is substituted
directly as a machine instruction.
Data definitions tell the compiler to place tables of words or
bytes into memory for use by programs. They have the form:
WORD name { datalist } ( place datalist into memory as words )
BYTE name { datalist } ( place datalist into memory as bytes )
where datalist is a list of constants separated by one or more spaces
and enclosed in braces. In the case of a WORD table, the datalist may
also include variable names or word names. (Variables must be
declared before they are included in the list) In these cases, the
address of the variable or word is placed in the list. For both word
and byte tables, the table name may be used by a program to push the
address of the table onto the stack (just like a variable name).
X list control words control generation of the link file that
is generated when fc is invoked with the -x option. The link file is
referred to as a .x file (its filename is sourcefilename".x"). The x
list control words appear below:
XCODE ( causes a CODE statement to be printed in the .x file )
XHEAP ( causes a HEAP statement to be printed in the .x file )
XLINK ( causes the next defined symbol* to be printed in the .x file)
*symbol should be a word, variable, or data definition
When XCODE appears in the source file, the .x file will contain
a CODE statement that defines code start at the next available
location when compilation finishes. When XHEAP appears in the source
file, the .x file will contain a HEAP statement that defines the heap
as the next available memory location when compilation finishes. When
XLINK appears before a word, variable, or data definition, it causes
the address of the following symbol to be included into the .x file as
an XWORD or XVARIABLE definition.
The x list control words should be used outside word and data
definitions. The XCODE and XHEAP words need only appear once in a file
to generate the CODE and HEAP statements in the .x file. The XLINK
word will typically appear immediately before the definition of the
word, variable, or data definition to be installed in the .x file.
Word definitions have the same format as standard FORTH words.
They start with a colon followed by the word name, a body of
statements, and end with a semicolon. Word definitions do not have to
be ordered in a source file since words may be called before they are
defined.
fc recognizes only a small number of basic FORTH words (called
statements here) summarized below:
if ..statements.. then
if ..statements.. else ..statements.. then
?dup_if ..statements.. then
?dup_if ..statements.. else ..statements.. then
for ..statements.. next
begin ..statements.. again
begin ..statements.. until
begin ..statements.. while ..statements.. repeat
exit ( return to calling word )
drop
swap
dup
over
nop ( no operation )
not (ones complement )
+ - +c -c xnor nand xor or ( alu operations )
g@ ( fetch from RTX2000 ASIC bus )
g! ( store to RTX2000 ASIC bus )
u@ ( fetch from RTX2000 user memory space )
u! ( store to RTX2000 user memory space )
c@ c! @ ! ( character and word fetch and store )
@+ c@+ @- c@- ( character/word fetch with auto increment/dec )
!+ c!+ !- c!- ( character/word store with auto inc/decrement )
0< 2* 2*c ( shift instructions )
cU2/ c2/ U2/
2/ N2* N2*c D2*
D2*c cUD2/ cD2/
UD2* D2/
of( ( indicates RTX2000 streamed instruction )
['] ( pushes address of following word onto stack )
"text" ( defines string, pushes string address onto stack )
, ( causes immediate code production )
Many statements are compatible with the simple words in the
FORTH-83 standard. The compiler also supports RTX2000 specific
instructions, such as streamed instructions using OF( (a closing right
parenthesis must be included to define the end of the streamed
instruction) and u!, u@, g!, and g@ memory and ASIC bus access
statements. fc also supports all the RTX2000 shift instructions. (See
RTX2000 data sheet or Programmer's Reference Manual from Harris) Also
supported is the comma operator which causes the compiler to start a
new RTX machine instruction with the statement that follows. This
feature is sometimes useful for writing optimized programs. Many
simple FORTH words are easily generated using macros. See
accompanying programs for examples.
Most RTX2000 compilers and interpreters directly support a
number of RTX2000 specific words for manipulating registers on the
processor's internal ASIC bus. These words are not directly supported
by fc; however, the macro feature allows programmers to define macros
for these instructions. Typically a file filled with these
definitions can be referenced using an include command in the program
source file. Likewise, support for RTX2010 features can be added
using macro commands.
The basic statements (words) supported by the compiler are
described below. Stack diagrams are also provided to show the state
of the parameter stack before and after execution of the statement.
if ..statements.. then
The top value of the parameter stack is popped and evaluated.
If the value is nonzero ..statements.. are executed.
if ..statementsA.. else ..statementsB.. then
The top value of the parameter stack is popped and evaluated.
If the value is nonzero, ..statementsA.. are executed, otherwise
..statementsB.. are executed.
?dup_if ..statements.. then
If the top value of the parameter stack is nonzero, then
..statements.. are executed. Otherwise, the stack is popped and
discarded.
?dup_if ..statementsA.. else ..statementsB.. then
If the top value of the parameter stack is nonzero, then
..statementsA.. are executed. Otherwise, the stack is popped, the
zero is discarded, and ..statementsB.. are executed.
begin ..statements.. again
Executes ..statements.. again and again in an endless loop.
begin ..statements.. until
..statements.. are executed. The top of the parameter stack is
popped by "until" and evaluated. If the value is nonzero, execution
continues with the statement following "until", otherwise
..statements.. are executed again.
begin ..statementsA.. while ..statementsB.. repeat
..statementsA.. are executed, then "while" pops the top of the
parameter stack. If the value is zero, statements after repeat are
executed, otherwise ..statementsB.. are executed, then ..statementsA..
are executed again and the "while" test is performed again.
exit ( -- )
Execution is returned to the calling word.
drop ( x -- )
The top of the parameter stack is popped and discarded.
swap ( x y -- y x )
The top two items on the parameter stack are exchanged.
dup ( x -- x x )
The top value of the parameter stack is copied and pushed onto
the parameter stack.
over ( x y -- x y x )
The second value of the parameter stack is copied and pushed
onto the parameter stack.
nop ( -- )
No operation is performed.
not ( x -- y )
The top of the parameter stack is replaced by its ones
complement, y = not(x).
+ ( a b -- c )
The top two items on the parameter stack are replaced by their
sum, c = a + b.
- ( a b -- c )
The top two items on the parameter stack are replaced by their
difference, c = a - b.
+c ( a b -- c )
The top two items on the parameter stack are replaced by their
sum plus the value of the carry bit, c = a + b + carry bit.
-c ( a b -- c )
The top two items on the parameter stack are replaced by their
difference minus the ones complement of the carry bit, c = a - b -
not(carry).
xnor ( a b -- c )
The top two items on the parameter stack are replaced by the
result of a logical exclusive nor operation, c = a xnor b.
nand ( a b -- c )
The top two items on the parameter stack are replaced by the
result of a logical nand operation, c = a nand b.
xor ( a b -- c )
The top two items on the parameter stack are replaced by the
result of a logical exclusive or operation, c = a xor b.
or ( a b -- c )
The top two items on the parameter stack are replaced by the
result of a logical or operation, c = a or b.
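
The carry and logic formulas above can be checked with a few lines of
Python. This is a model under stated assumptions: results wrap to 16
bits, the carry is passed in as 0 or 1, and not(carry) is read as the
one bit complement of the carry. The function names are hypothetical,
and the way each instruction updates the carry bit afterwards is not
modelled.

```python
MASK = 0xFFFF   # results are kept to 16 bits


def plus_c(a, b, carry):
    """+c : a + b + carry bit."""
    return (a + b + carry) & MASK


def minus_c(a, b, carry):
    """-c : a - b - not(carry), where not(carry) is 1 when carry is 0."""
    return (a - b - (1 - carry)) & MASK


def xnor(a, b):
    """xnor : ones complement of the exclusive or."""
    return ~(a ^ b) & MASK


def nand(a, b):
    """nand : ones complement of the and."""
    return ~(a & b) & MASK
```

For instance, minus_c(5, 3, 1) is 2 and minus_c(5, 3, 0) is 1.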
g g@ ( -- x )
Fetches word x from address g on the processor's ASIC bus and
pushes onto parameter stack. g is a constant between zero and thirty-
one. See RTX2000/RTX2010 data sheet for register assignments.
g g! ( x -- )
Pops top of parameter stack and stores it into address g of the
processor's ASIC bus. g is a constant between zero and thirty-one.
See RTX2000/RTX2010 data sheet for register assignments.
u u@ ( -- x )
Fetches word x from user space address u and pushes onto the
parameter stack. u is a constant between zero and thirty-one.
u u! ( x -- )
Pops top of parameter stack and stores it into address u of the
user space. u is a constant between zero and thirty-one.
@ ( a -- d )
Replaces the top of the parameter stack with a word fetched
from the address a.
c@ ( a -- d )
Replaces the top of the parameter stack with the byte fetched
from memory address a.
! ( d a -- )
Pops two items from the top of the parameter stack and stores
word d into memory address a.
c! ( d a -- )
Pops two items from the top of the parameter stack and stores
data d as a byte into memory address a.
@+ ( a -- d a+2 )
Pop address a from top of parameter stack and fetch word d from
memory address a. Push d and a+2 onto parameter stack.
c@+ ( a -- d a+1 )
Pop address a from top of parameter stack and fetch byte d from
memory address a. Push d and a+1 onto the parameter stack.
@- ( a -- d a-2 )
Pop address a from the top of the parameter stack and fetch
word d from memory address a. Push d and a-2 onto the parameter
stack.
c@- ( a -- d a-1 )
Pop address a from the top of the parameter stack and fetch
byte d from memory address a. Push d and a-1 onto the parameter
stack.
!+ ( d a -- a+2 )
Pop top two items off the parameter stack. Store word d into
memory address a then push value a+2 onto parameter stack.
c!+ ( d a -- a+1 )
Pop top two items off the parameter stack. Store d as a byte
into memory address a then push value a+1 onto the parameter stack.
!- ( d a -- a-2 )
Pop top two items off the parameter stack. Store word d into
memory address a then push value a-2 onto the parameter stack.
c!- ( d a -- a-1 )
Pop top two items off the parameter stack. Store d into memory
address a as a byte, then push value a-1 onto the parameter stack.
0< ( a -- b )
The top of the parameter stack is replaced by a value obtained
by extending the most significant bit to every bit in the word.
2* ( a -- b )
The top of the parameter stack is shifted left by one bit.
Zero is shifted into the lsb and the msb is shifted into the carry
bit.
2*c ( a -- b )
The top of the parameter stack is shifted left by one bit. The
carry bit is shifted into the lsb and the msb is shifted into the
carry bit.
cU2/ ( a -- b )
The top of the parameter stack is shifted right by one bit.
The carry bit is shifted into the msb and the lsb is discarded. Carry
bit is set to zero.
c2/ ( a -- b )
The top of the parameter stack is shifted right by one bit.
The carry bit is shifted into the msb and the lsb is shifted into the
carry bit.
U2/ ( a -- b )
The top of the parameter stack is shifted right by one bit.
Zero is shifted into the msb and the lsb is discarded. The carry bit
is set to zero.
2/ ( a -- b )
The top of the parameter stack is shifted right by one bit.
The msb remains unchanged and the lsb is discarded. The carry bit is
set to the value of the msb.
N2* ( a x -- b x )
The second value from the top of the parameter stack is shifted
left by one bit. Zero is shifted into the lsb. The carry bit is not
changed.
N2*c ( a x -- b x )
The second value from the top of the parameter stack is shifted
left by one bit. The carry bit is shifted into the lsb. The carry
bit is not changed.
D2* ( a b -- c d )
The top two items on the parameter stack are shifted left one
bit together as a thirty-two bit word. The msb of the top is shifted
into the carry bit. Zero is shifted into the lsb of the second item.
D2*c ( a b -- c d )
The top two items on the parameter stack are shifted left one
bit together as a thirty-two bit word. The carry bit is shifted into
the lsb of the second item and the msb of the top is shifted into the
carry bit.
cUD2/ ( a b -- c d )
The top two items on the parameter stack are shifted right one
bit together as a thirty-two bit word. The carry bit is shifted into
the msb of the top and the lsb of the second item is discarded. The
carry bit is set to zero.
cD2/ ( a b -- c d )
The top two items on the parameter stack are shifted right one
bit together as a thirty-two bit word. The carry bit is shifted into
msb of the top and the lsb of the second item is shifted into the
carry.
UD2* ( a b -- c d )
The top two items on the parameter stack are shifted right one
bit together as a thirty-two bit word. Zero is shifted into the msb of
the top and the lsb of the second item is discarded. The carry bit is
set to zero.
D2/ ( a b -- c d )
The top two items on the parameter stack are shifted right one
bit together as a thirty-two bit word. The msb of the top remains
unchanged and the lsb of the second item is discarded. The carry bit
is set to the state of the msb of the top item.
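
Several of the shift descriptions above are modelled below in Python
as a cross-check. Each function takes 16 bit values in the range 0 to
0xFFFF and returns the new value (or values) followed by the new
carry bit. The function names are hypothetical, and the model follows
the wording of this document rather than the Harris manual.

```python
MASK = 0xFFFF


def shl(a):                    # 2*   zero into lsb, msb into carry
    return (a << 1) & MASK, a >> 15


def shl_c(a, carry):           # 2*c  carry into lsb, msb into carry
    return ((a << 1) | carry) & MASK, a >> 15


def ushr(a):                   # U2/  zero into msb, carry cleared
    return a >> 1, 0


def shr_c(a, carry):           # c2/  carry into msb, lsb into carry
    return (carry << 15) | (a >> 1), a & 1


def ashr(a):                   # 2/   msb kept, carry set to msb
    msb = a >> 15
    return (msb << 15) | (a >> 1), msb


def dshl(second, top):         # D2*  top is the high word of the pair
    wide = ((top << 16) | second) << 1
    return wide & MASK, (wide >> 16) & MASK, top >> 15
```

dshl returns the new second item, the new top item, and the carry, in
that order.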
for ...statements... next
"for" causes a loop count (n) to be popped off the parameter
stack and ...statements... are executed n times. The iteration count
minus one can be obtained by reading the top value of the return stack
(using r@). ( Note: return instructions should not be executed inside
for...next loops since the iteration count is stored on the return
stack)
of( ( n -- )
Indicates that an instruction, or instructions, should be repeated.
A closing right parenthesis is required to indicate the range of
instructions to be repeated. The instruction, or instructions, must
compile into a single 16 bit RTX2000 machine instruction or an error
will occur. The repetition count, n, is popped off the parameter
stack and the indicated instruction(s) are executed n + 1 times.
['] user_defined_word
When ['] precedes a user defined word, the address of the word
is pushed onto the parameter stack instead of the word being called as a
subroutine.
"String Text" ( -- a )
Pushes the address of the defined string onto the parameter
stack. The first byte of the string will be a count of the number of
characters that follow. (Assumes that the RTX2000 is configured for
Motorola type byte addressing)
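The counted string layout described above can be pictured with a short
Python sketch (not fc code). The function name is hypothetical, and the
sketch only shows the byte sequence; how fc aligns or pads the string
in target memory is not stated here and is not modelled.

```python
def counted_string(text):
    """Return the bytes of a counted string: a length byte, then the text."""
    data = text.encode("ascii")
    if len(data) > 255:
        # The count occupies a single byte, so longer strings cannot fit.
        raise ValueError("count must fit in one byte")
    return bytes([len(data)]) + data
```

The address pushed on the parameter stack would refer to the count
byte, with the characters at the following byte addresses.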
,
Comma causes the compiler to start a new RTX2000 machine
instruction with the statement that follows. Commas allow programmers
to explicitly show how a program or portion of a program should be
partitioned into RTX2000 instructions. This allows the user to
directly control which statements are combined into RTX2000
instructions. Commas are used to force a desired optimization that
the compiler may not be able to achieve without the help of the
programmer. This feature is useful when writing super optimized code
when the programmer has a good understanding of the RTX2000
instruction set.
In addition to the statements (words) defined above, the
program may refer to user defined variables and words. All variables
must be defined before they are used, because fc assumes all undefined
references to be subroutine calls (user defined words). Words do not
have to be defined before they are used but must be defined somewhere
in the program being compiled.
Comments may be placed anywhere in the program file and begin
with a space or newline followed by a left parenthesis. They may
extend over several lines, and matching pairs of parentheses may be
included inside comments. A right parenthesis matching the left
parenthesis that started the comment ends the comment.
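The comment rule can be sketched in Python as a small depth counter
(this is not fc's lexer, and the function name is hypothetical). The
sketch assumes the start of the file counts as a preceding newline,
that any left parenthesis inside a comment opens a nested pair, and it
ignores quoted strings entirely.

```python
def strip_comments(source):
    """Remove parenthesised comments, honouring nested pairs."""
    out = []
    depth = 0
    prev = "\n"                      # assumption: file start acts like a newline
    for ch in source:
        if depth == 0:
            if ch == "(" and prev in " \n":
                depth = 1            # a comment opens only after space or newline
            else:
                out.append(ch)       # ordinary program text is kept
        elif ch == "(":
            depth += 1               # nested pair inside the comment
        elif ch == ")":
            depth -= 1               # depth 0 means the comment has ended
        prev = ch
    return "".join(out)
```

Because a comment must follow a space or newline, a word such as of(
is left untouched by this sketch.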
fc Code Optimization
fc performs optimization at the source code level by collecting
the greatest number of FORTH statements that fit into one of its
syntax templates (coded using yacc). Many of these templates
correspond directly to single RTX2000 instructions while others
provide a default path in case single instruction optimization can't
be achieved. As the compiler collects FORTH statements one at a time
to fit to a given template, other templates must be available so that
code can be produced at any point (such as when a call statement is
encountered, which cannot be combined with other statements). The
yacc generated parser ensures that the longest template will be
chosen, incorporating as many FORTH instructions as feasible (given a
particular template list) into a single RTX2000 instruction. This
approach does not necessarily produce the shortest RTX2000 program,
but it is easy to code and allows easy incorporation of new
optimizations.
The comma operator can be used to force the compiler to abandon
looking for additional statements to pack into the machine
instruction. The statements following the comma will always start a
new RTX2000 machine instruction.
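The longest-template idea and the effect of the comma can be
illustrated with a greedy matcher in Python. This is only a sketch: fc
itself uses a yacc generated parser, the template list below is
hypothetical, and "lit" stands in for a literal constant.

```python
# Hypothetical templates; the real list lives in fc's yacc grammar.
TEMPLATES = [
    ("dup", "lit", "+"),
    ("swap", "-"),
    ("lit", "+"),
]

def pack(tokens, templates=TEMPLATES):
    """Group statements into instructions, trying the longest template first."""
    ordered = sorted(templates, key=len, reverse=True)
    groups = []
    i = 0
    while i < len(tokens):
        if tokens[i] == ",":
            i += 1                       # comma: the next statement starts a new group
            continue
        for tpl in ordered:
            if tuple(tokens[i:i + len(tpl)]) == tpl:
                groups.append(tpl)       # the whole template becomes one instruction
                i += len(tpl)
                break
        else:
            groups.append((tokens[i],))  # default path: the statement stands alone
            i += 1
    return groups
```

No template contains a comma, so a group can never span one. With
these templates, "dup lit +" packs into one group, while "dup , lit +"
is split into "dup" followed by "lit +".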
fc incorporates limited single token look ahead in its lexical
analyzer. Whenever a short constant or a "swap" is encountered, it
looks ahead to see what the next token is. If a "u@", "u!", "g@", or
"g!" follows a short constant, then the compiler must start a new
instruction beginning with the short constant. When a "swap" is
encountered that precedes an alu operation, it can be incorporated
into the same instruction as the alu operation. Also, if a "swap"
follows a "swap", both are ignored. This processing allows the
compiler to aggressively optimize out any number of swaps that might
precede an alu operation.
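The swap handling can be sketched in Python as follows. This models
only the swap rules described above, not the short constant rule; the
function name and the set of alu operations are illustrative
assumptions, and a folded pair is shown as a single "swap op" string.

```python
ALU_OPS = {"+", "-", "and", "or", "xor"}   # illustrative subset only

def fold_swaps(tokens):
    """Cancel swap pairs and fold a surviving swap into a following alu op."""
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] != "swap":
            out.append(tokens[i])
            i += 1
            continue
        run = 0
        while i < len(tokens) and tokens[i] == "swap":
            run += 1                     # count the consecutive swaps
            i += 1
        if run % 2 == 1:                 # pairs cancel; an odd swap survives
            if i < len(tokens) and tokens[i] in ALU_OPS:
                out.append("swap " + tokens[i])   # one combined instruction
                i += 1
            else:
                out.append("swap")
    return out
```

Counting a run of swaps gives the same result as cancelling them two
at a time with single token look ahead.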
Default Object Code Format
Whenever fc is invoked without the e, r, or o option, and no
errors have occurred during compilation, fc will produce an object
file in the default format. This format consists of a sequence of one
or more sections of code. Each section consists of a start address, a
byte count, and RTX2000 instructions in binary format. The compiler
will produce a new section each time the code address is set using a
"code" definition, and whenever the current section becomes longer than
1024 bytes. The format of the default object code file appears below.
Each data entity is stored as a sixteen bit word.
Number of Sections
Starting address of section #1
Number of bytes in section #1
Data for section #1
{ Starting address of section #2 (if needed) }
{ Number of bytes in section #2 (if needed) }
{ Data for section #2 (if needed) }
....
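A reader and writer for this layout might look like the Python sketch
below. The function names are hypothetical. The byte order of the
sixteen bit words is not stated in this document, so it is left as a
parameter (big-endian by default, purely as an assumption), and the
section data is treated as raw bytes with no padding.

```python
import struct

def write_object(sections, byte_order=">"):
    """Build a default-format object file from (start_address, data) pairs."""
    out = struct.pack(byte_order + "H", len(sections))      # number of sections
    for start, data in sections:
        out += struct.pack(byte_order + "HH", start, len(data))
        out += data                                         # section contents
    return out

def read_object(blob, byte_order=">"):
    """Parse a default-format object file into (start_address, data) pairs."""
    (count,) = struct.unpack_from(byte_order + "H", blob, 0)
    pos = 2
    sections = []
    for _ in range(count):
        start, nbytes = struct.unpack_from(byte_order + "HH", blob, pos)
        pos += 4
        sections.append((start, blob[pos:pos + nbytes]))
        pos += nbytes
    return sections
```

Passing "<" as byte_order selects little-endian words instead, which
may be needed depending on the machine that produced the file.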
DMSP Load Block Code Format
Whenever fc is invoked with the o option and no compilation
errors occur, code is produced in the DMSP Load Block Format. This
format consists of a series of 16 bit numbers represented in ascii/hex
format. Each code section produced by the compiler is packed into an
independent load block and all load blocks are concatenated into a
single file. The output file format appears below.
Configuration Number
Start Address for section #1
Data Word Count for section #1
Control Checksum for section #1 header
Data for section #1
Block checksum for section #1
{ Configuration Number for section #2 (if needed)}
{ Start Address for section #2 (if needed) }
{ Data Word Count for section #2 (if needed) }
{ Control Checksum for section #2 (if needed) }
{ Data for section #2 (if needed) }
{ Block checksum for section #2 (if needed) }
....
The configuration number for each load block will equal the
compiler version (for version 1.13, the configuration number = 000D
(hex)). The checksum algorithm is add and rotate 1 bit right. The
control (header) checksum covers the first three words in a block - the
configuration number, the start address, and the word count. The
block checksum covers the configuration number, the start address, the
word count, the control checksum, and all the data.
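One possible reading of the load block layout and checksum is sketched
below in Python. Several details are assumptions rather than facts
from this document: the running sum starts at zero, each word is added
before the rotate, arithmetic wraps at sixteen bits, and each word is
written as four uppercase hex digits with no separators. The function
names are hypothetical.

```python
def rotate_right16(x):
    """Rotate a 16 bit value right by one bit."""
    return ((x >> 1) | ((x & 1) << 15)) & 0xFFFF

def checksum(words):
    """Add each word to the running total, then rotate right one bit."""
    total = 0                                  # assumption: initial value is zero
    for w in words:
        total = rotate_right16((total + w) & 0xFFFF)
    return total

def load_block(config, start, data_words):
    """Return one load block as a string of ascii hex words."""
    header = [config, start, len(data_words)]
    control = checksum(header)                 # covers the first three words
    body = header + [control] + list(data_words)
    block = body + [checksum(body)]            # covers header, control and data
    return "".join("%04X" % w for w in block)
```

A multi-section file would then be the concatenation of one such
string per code section.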
Error Messages
fc produces a limited number of rather general error messages.
All error messages, except for undefined word, are accompanied by the
file name, line number, and text from the vicinity of the error.
After an error has occurred, the compiler continues until it completes
the file. When an error occurs within a word definition, the compiler
may skip over nearby errors and not report them. All errors will be
included in the list file if the l option is specified.
A missing colon in a word definition may cause the compiler to
print out a lot of "invalid statement" errors, one for each label
encountered in the word definition.
Limitations
fc stores generated code internally in a fixed size buffer.
Because of the defined length of the buffer, all programs should be
limited to 20K - 25K bytes or shorter.
Revision History
fc version 1.3 was the first version intentionally distributed;
however, a short revision history is given below:
1.2   First revision with all EPROM output options
1.2a  Bug in l option fixed; l option no longer hangs up the
      program for outputs that start with indented lines.
1.3   DMSP output option added.
1.4   var keyword changed to variable
      cvar keyword changed to cvariable
      xvar keyword changed to xvariable
      inline keyword changed to ucode
      return keyword changed to exit
      To increase compatibility with Harris TFORTH compiler
      Bug fix in parser to support heap definition of 0x10000
1.5   ['] word added
      String support added
1.5a  IBM only - bug in string support fixed
1.5b  IBM only - bug in ['] fixed
1.6   DMSP option changed so that there are no spaces in file
      Configuration number for this option is set to zero
      instead of being prompted from the user
1.7   Fixed disassembly output for DUP d g! instructions
      Added comma feature
      Added word and byte data definitions
1.8   Corrected bug with instructions of the form DUP d alu-op
      Changed configuration word to equal the compiler version
1.9   Corrected bug with DMSP block load checksum
      Printout size of each output section
      Modified preprocessor so that strings are ignored
      in macro substitutions
1.10  Fixed bug with constants used in "word" statements
      Fixed preprocessor so that #, (, and ) may be used in
      strings
      Added xheap, xcode, and xlink statements
      Added -s option
1.11  Fixed bug with @- and c@- instructions
      Filename extensions allowed for source files
1.12  Added optimization for longlit OVER alu-op instruction
      Added environment variables
      Added conditional compilation
      Fixed bug with disassembly of d g@ OVER alu-op
      Changed command line print-out for code production
1.13  Speeded up preprocessor and symbol table
      Added optimization for DUP d u@ aluop instruction
      Allows use for larger programs
      Added -i -t options
1.14  MS Windows Version (wfc)
      Trailing '\' character no longer needed for specifying
      path using environment variables
      No longer prompts user to stop after set number of errors
References
The second reference is an excellent general reference for
scientific computing which also contains some search and sorting
algorithms. The other three references provide a good practical
introduction to compiler writing using coded examples.
Aho, A. V., Sethi, R., and Ullman, J. D., Compilers:
Principles, Techniques, and Tools, Addison-Wesley Publishing,
Massachusetts, 1988.
Flannery, B. P., Press, W. H., Teukolsky, S. A., and
Vetterling, W. T., Numerical Recipes in C: The Art of Scientific
Computing, Cambridge University Press, New York, 1988.
Friedman, H. G. Jr., and Schreiner, A. T., Introduction to
Compiler Construction with UNIX, Prentice-Hall, New Jersey, 1985.
Kernighan, B. W., and Pike, R., The Unix Programming
Environment, Prentice-Hall, New Jersey, 1984.
Lloyd Linstrom
October 1992